The GA Guard series is a family of open-weight moderation models designed to help developers and organizations keep language models safe, compliant, and aligned with real-world use. This model detects seven categories of violation: illegal activities; hate and abuse; personally identifiable information (PII) and intellectual property; prompt security; sexual content; misinformation; and violence and self-harm.
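As a rough illustration of how an application might represent the seven categories listed above, here is a minimal Python sketch. The enum, the `flagged_categories` and `is_allowed` helpers, and the per-category boolean verdict mapping are all hypothetical names and shapes chosen for this example; they are not the model's documented output format or API.

```python
from enum import Enum


class ViolationCategory(Enum):
    """The seven categories named in the description (labels are illustrative)."""
    ILLEGAL_ACTIVITIES = "illegal activities"
    HATE_AND_ABUSE = "hate and abuse"
    PII_AND_IP = "personally identifiable information and intellectual property"
    PROMPT_SECURITY = "prompt security"
    SEXUAL_CONTENT = "sexual content"
    MISINFORMATION = "misinformation"
    VIOLENCE_AND_SELF_HARM = "violence and self-harm"


def flagged_categories(verdicts):
    """Return the categories marked as violated, in enum definition order.

    `verdicts` is an assumed mapping from ViolationCategory to bool;
    categories missing from the mapping are treated as not violated.
    """
    return [c for c in ViolationCategory if verdicts.get(c, False)]


def is_allowed(verdicts):
    """Content passes only if no category is flagged."""
    return not flagged_categories(verdicts)
```

A caller would first convert whatever the guard model returns into such a mapping, then block or log the content when `is_allowed` returns `False`.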
Natural Language Processing
Transformers
English